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Organization-wide measurement of software products 
and processes is needed to establish full life 
cycle control over software products. The Software 
Engineering Laboratory (SEL) — a joint venture 
between NASA's Goddard Space Flight Center, the 
University of Maryland, and Computer Sciences Corpo- 
ration-started measurement of software development 
more than 1 5 years ago. Recently, the measurement 
of maintenance has been added to the scope of the 
SEL. In this article, the maintenance measurement pro- 
gram is presented as an addition to the already existing 
and well-established SEL development measurement 
program and evaluated in terms of its immediate bene- 
fits and long-term improvement potential. Immediate 
benefits of this program for the SEL include an in- 
creased understanding of the maintenance domain, the 
differences and commonalities between development 
and maintenance, and the cause-effect relationships 
between development and maintenance. Initial results 
from a sample maintenance study are presented to 
substantiate these benefits. The long-term potential of 
this program includes the use of maintenance base- 
lines to better plan and manage future projects and to 
improve development and maintenance practices for 
future projects wherever warranted. 

1. INTRODUCTION 

Most software organizations lack satisfactory control 
over their development and maintenance projects. This 
lack of control is exemplified by the absence of explicit 
models enabling the identification of ambiguous prod- 
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uct requirements, the selection of practices best suited 
to achieve given requirements, or the prediction of the 
impact early project decisions may have on the quality 
of the resulting products. Each organization has its own 
set of control problems and reasons standing in the way 
of improvement. Comprehensive measurement pro- 
grams are needed as a first step toward improvement 
[1]. Such programs can help identify the specific prob- 
lems of an organization in quantitative terms, pinpoint 
possible causes, motivate improvements, and assess 
alternatives considered for improvement. 

The Software Engineering Laboratory (SEL)— a 
joint venture including government, industry, and 
university—* began measurement of satellite ground 
support software development projects in 1976. The 
three primary organizational members of the SEL are 
the Systems Development Branch at NASA’s Goddard 
Space Flight Center, the Computer Science Department 
at the University of Maryland, and the Systems Devel- 
opment Operation at Computer Sciences Corporation. 
This collaboration has produced numerous case studies 
and controlled experiments [2-6]. Results from these 
case studies and experiments motivated several 
improvements within the SEL [7-9]. 

In 1988, the SEL incorporated maintenance into its 
scope of measurement. The result is an even more 
comprehensive measurement program in which data is 
now being collected during development and mainte- 
nance of all software systems. In the SEL, pre- and 
posdaunch maintenance activities are performed by 
separate organizational entities. Currently, maintenance 
data are only collected from prelaunch maintenance 
activities. In the remainder of this article, the term 
“maintenance” shall refer to this prelaunch phase 
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between delivery of a completed software system and 
the actual launch of the related spacecraft. This mainte- 
nance measurement program is customized to the perti- 
nent SEL characteristics, including the definition of 
maintenance, the maintenance improvement goals, and 
other product, process, and people factors. 

Empirical research in the SEL is based on the idea of 
continuous improvement. This idea has been formu- 
lated as the quality improvement paradigm [1]. Accord- 
ing to this paradigm, improvement is the result of 
continuously understanding current practices, changing 
them, and empirically validating the impact of these 
changes. Improvement requires measurement. 

In the SEL, measurement goals define the data to be 
collected and provide the context for data interpreta- 
tion. This goal-oriented approach to measurement has 
been formulated as the goal /question/metric paradigm 
[1, 10, 11]. It suggests defining each goal by develop- 
ing a set of analysis questions, which in turn lead to a 
set of metrics and data. The short-term goals of our 
maintenance measurement program have been to 
increase the understanding of maintenance within the 
SEL; the long-term goals are to stimulate improve- 
ments in the SEL's ability to plan and manage future 
maintenance projects and— whenever needed— to moti- 
vate the use of different development and maintenance 
practices. 

Specific characteristics of the SEL maintenance envi- 
ronment as well as the comprehensive scope of our 
measurement approach make this program unique. The 
study results presented here may not be directly compa- 
rable to those from other maintenance environments, 
yet they do show how a comprehensive measure pro- 
gram can be used to better understand and improve an 
organization’s development and maintenance process 
and products. Few comprehensive maintenance studies 
have been published [12-14]. Most empirical mainte- 
nance studies report on laboratory-style controlled 
experiments [15, 16], isolated case studies [13, 17], or 
project surveys [18]. A survey of maintenance studies 
has been published by Hale and Haworth [19]. 

The purpose of this article is to state our initial 
maintenance study goals and questions, present the 
related results, and propose— based on what we have 
learned— a revised set of goals and questions for future 
studies. 

The study results are organized according to the 
types of data used to address the goals and questions: 
quantitative maintenance baselines, comparisons 
between quantitative development and maintenance 
baselines, and qualitative information regarding the 
cause-effect relationships between development and 
maintenance. These results have increased our under- 
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standing of maintenance processes and maintained 
products in the SEL, commonalities and differences 
between development and maintenance, and develop- 
ment characteristics affecting maintenance. On occa- 
sion, our results are carefully compared with results 
from other published studies or widely believed mainte- 
nance myths. 

We begin our presentation with a background discus- 
sion of the SEL and the new maintenance measure- 
ment program (sections 2 and 3, respectively). We then 
present the results of our study (section 4). We con- 
clude with an assessment of the SEL maintenance 
measurement program and a revised set of goals and 
questions for future maintenance studies. 

2. THE SEL 

The goals of the SEL are to understand its software 
development processes, to measure the effects of vari- 
ous methods and tools on these processes, and to 
identify and then apply new, improved development 
practices. Improved understanding within this particu- 
lar environment provides the basis for better planning 
and management as well as a rationale for adopting new 
practices [4]. 

Development in the SEL supports satellite missions. 
SEL studies generally focus on attitude ground support 
systems and their associated simulators. These product 
lines are very stable: the system architecture, documen- 
tation standards, and organizational responsibilities do 
not change significantly from one mission to another. 
Attitude ground support systems have 130-240K lines 
of FORTRAN source code (where a line of code is 
measured as a physical line, including comment lines) 
and require 15-30 staff years to develop. Simulators 
have 25-75K lines and require 3-10 staff years to 
develop. 

Research in this environment is guided by two basic 
paradigms: the quality improvement paradigm (QIP) 
and the goal /question /metric paradigm (GQM). The 
QIP, which applies the principle of continuous im- 
provement to software engineering, defines the context 
for measurement within the SEL [1]. Accordingly, 
software development can be improved by iterating the 
following steps for each project: (1) characterize the 
corporate environment; (2) state improvement goals in 
quantitative terms; (3) plan the appropriate develop- 
ment practices and methodologies together with mea- 
surement procedures for the project at hand; (4) perform 
the development and measure, analyze, and provide 
feedback; and (5) perform postmortem analysis and 
provide recommendations for future projects. Each QIP 
iteration is characterized by its own set of goals. These 
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goals reflect — and evolve with — the maturity of the 
investigated organization. 

Measurement in the SEL is guided by the GQM 
paradigm [10]. Measurement is used to characterize 
current development practices, monitor and manage 
development projects, identify strengths and weak- 
nesses of the current practices, and evaluate promising 
new technologies in a controlled environment. The 
GQM paradigm describes a goal-oriented approach 
to measurement in which metrics are tied to spe- 
cific measurement goals. According to the GQM para- 
digm, each measurement goal is listed explicitly, 
a set of specific questions is posed to address each 
goal, and specific metrics and measurement proce- 
dures are defined to support the questions. The result- 
ing data collection procedures and interpretations are 
tailormade to the study’s goals and local environment 
characteristics. For instance, in the SEL, this generally 
means that metrics and measurement procedures reflect 
the use of SEL-specific development practices, fit the 
organizational structure, and permit comparisons with 
historical data. Goals, questions, and metrics provide a 
context that helps ensure that data are interpreted cor- 
rectly and are compared only to data and results from 
similar contexts. 

Two types of measurement are common in the SEL: 
routine monitoring and exploratory studies. Routine 
monitoring is used to characterize the local environ- 
ment broadly. The resulting quantitative and qualitative 
baselines are used to plan and manage new projects and 
to compare the effects of newly introduced tools or 
methods against [6]. Objective and subjective data are 
routinely gathered for each project [20] . Objective data 
include staff hours, computer utilization, source code 
growth, and the number and kinds of changes made to 
the source code. Subjective data characterize the soft- 
ware development process and software product charac- 
teristics. The data for over 100 projects monitored 
over the last 15 years is maintained in the SEL 
database [21]. 

Exploratory studies are used when the SEL is in the 
initial phase of understanding a process or methodol- 
ogy. For example, the SEL is currently studying three 
projects following the cleanroom methodology [22]. 
Special data collection procedures were designed for 
these projects to permit researchers to monitor the 
effort spent in reading and reviewing designs and code. 

Measurement in the SEL has provided a rationale for 
making evolutionary changes to NASA’s development 
practices, including stricter use of code-reading tech- 
niques [5], guidelines for Ada projects [23], and the 
adoption of the cleanroom development approach [24]. 
With the addition of maintenance measurement,' the 
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SEL is attempting to lay the foundation for similar 
improvements in maintenance. 

3. THE SEL MAINTENANCE MEASUREMENT 
PROGRAM 

The following subsections describe the SEL mainte- 
nance environment and the specific goals and proce- 
dures of our measurement program. A more detailed 
description of this environment, its products, and main- 
tenance processes appeared in the proceedings of the 
1989 IEEE Conference on Software Maintenance [25]. 

3.1 Maintenance Environment 

In the SEL, maintenance is partly defined by organiza- 
tional responsibility and schedule. As depicted in 
Figure 1, each product passes through three different 
organizational units during its lifetime: analysts produce 
the initial functional specifications used by the deve- 
lopers and remain responsible for these speci- 
fications throughout development and until launch: 
operations assumes complete responsibility after 
launch. During the period between development 
and launch, the analysts have complete responsibility 
for the system, including the implementation of any 
changes. 

In this study, maintenance refers specifically to soft- 
ware change activities performed by the analysts during 
the postdevelopment, prelaunch phase. By nature of 
these constraints, the maintenance phase is typically 
shorter in the SEL than in other environments (one to 
two years), and the maintenance changes are not trig- 
gered by operational failures but by failures detected 
during simulated uses of the software by prospec- 
tive operators and externally triggered changes of the 
overall satellite mission. 



Figure l. Organizational structure of the SEL environment. 
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The products maintained are the same simulators and 
attitude ground support systems described in section 2. 
Typically, the effort expended during the one- to two- 
year time frame that these systems are in maintenance 
is approximately 5 % of the development effort. Mainte- 
nance procedures vary from project to project depend- 
ing on the type of system being maintained, the size of 
the maintenance team (2-10 people on the projects 
studied), the specific methods and tools elected by the 
individual programmers, and other factors. In general, 
formal change control procedures are followed; changes 
are implemented one at a time, but may be tested 
in groups; and one maintainer is responsible for 
implementing each change. 

3.2 Maintenance Measurement Goals 

Consistent with the overall directions of the SEL, we 
chose three general goals for the maintenance measure- 
ment program: (1) to understand maintenance processes 
and products better; (2) to improve our ability to 
manage current maintenance projects and plan future 
ones; and (3) to establish a sound basis for in proving 
development from a maintenance perspective. 

Following the QIP, the initial goals focus on under- 
standing maintenance. Representative measurement 
goals and questions selected for this study are summa- 
rized in Figures 2, 5, and 11. Analysis results related to 
these goals and questions are presented in section 4. 

3.3 Maintenance Measurement Procedures 

The data collection procedures used in this study were 
designed according to the principles of the GQM 
paradigm. Data were collected via exploratory inter- 
views and routine data collection forms [20]. The rou- 
tine data collection forms used during maintenance 
include the Weekly Maintenance Effort Form and the 
Maintenance Change Report Form (Appendix A). The 
effort form is filled out once per week per maintainer 
per system; one change form is filled out per completed 
change. The weekly effort forms record the distribution 
of effort (in staff hours) by type of change (correction, 
enhancement, adaptation, or other 1 ) and by engineering 
activity (designing, coding, etc.). The change forms 
record the distribution of changes by type of change, 
size of change, changed objects (e.g., code, user’s 
guide), expended staff time, fault type (if applicable), 
and more. Ail data are validated through a series of 


’All maintenance effort that cannot be attributed to an individual 
maintenance change is classified as ‘'other.'* This includes effort 
related to management, meetings, and training. 


checks by the data entry personnel, project managers, 
and SEL researchers. Data are stored and made avail- 
able to researchers and developers through the SEL 
database [21]. 

4. MAINTENANCE MEASUREMENT BENEFITS 

The maintenance measurement program has already 
increased understanding of maintenance in the SEL. 
Previously, much of this understanding was at best 
intuitive and approximate. In this section we demon- 
strate what we have learned as a result of our initial 
study. The results are separated into baseline character- 
izations of maintenance, a comparative analysis of 
development and maintenance, and an analysis of how 
development decisions affect maintenance. 

In this study, we restrict our analyses to three large 
attitude ground support systems for which we have 
complete and valid data: the Gamma Ray Observatory, 
the Geostationary Operational Environmental Satellite, 
and the Cosmic Background Explorer. Maintenance of 
these systems was performed between 1988 and 1991. 
A total of 90 changes and over 10,000 hours of effort 
serve as the basis for all quantitative analyses of main- 
tenance presented here. 

Examining the data on these three projects has pro- 
vided valuable insight into the maintenance process 
within this environment. The results presented here are 
intended to demonstrate the increased understanding 
of the maintenance process that can result from a 
measurement program. 

4.1 Maintenance Baselines 

The first step toward understanding any environ- 
ment is to develop baselines describing that environment 
[12, 14]. The goals and questions related to this part of 
the SEL study are listed in Figure 2. They are intended 


GOAL 1 : Characterize the changes performed during 
maintenance. 

QUESTION 1 

How many changes of each type are completed? 
QUESTION 2 

How much effort is spent on changes of each type? 
GOAL 2: Characterize product evolution during 
maintenance. 

QUESTION 3 

How much code is affected by each change? 

QUESTION 4 

Is code added, changed or deleted? 

GOAL 3: Characterize the maintenance process stability 
QUESTION 5 

How do maintenance processes differ across projects? 

Figure 2. Measurement goals for understanding maintenance 
in the SEL. 
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to characterize what kinds of changes are performed 
during maintenance, which parts of the systems change 
and how, and what maintenance processes are fol- 
lowed. In the long term, the resulting baselines are 
expected to provide a basis for determining whether 
new techniques or process adjustments have any mea- 
surable impact on the SEL maintenance processes or 
products. Any comparison between SEL baselines and 
baselines from other environments must take environ- 
mental differences into account. 

Each maintenance change in this environment is well 
defined by a formal change request. There are several 
key steps in the change process: changes must be 
approved, implemented, tested, and released. In gen- 
eral, more changes are approved than can be imple- 
mented. This poses the difficult management problem 
of selecting which changes to implement. This decision 
is based on the importance of the changes approved as 
well as the budget available to make changes. The 
implementation of a change is performed by one pro- 
grammer; there is no standard, formal methodology. 
Testing, beyond debugging by the programmer, is per- 
formed for several changes at once. One important 
implication is that the associated effort measured cannot 
be ascribed to a particular change. In fact, testing is 
typically performed at two levels: the first level pro- 
vides internal checkpoints for configuration manage- 
ment; the second level occurs before each release. 

Each maintenance change performed in the SEL is 
classified as an enhancement, adaptation, or correction 
[26]. A simple count of changes suggests that mainte- 
nance is primarily corrective; however, the effort distri- 
bution reveals that most effort is actually related to 
enhancements (Figure 3). Either way, adaptations do 
not seem to contribute significantly (Figure 2, questions 
1 and 2). Note that the average enhancement requires 
just over twice the effort of the average correction. 
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This phenomenon could be caused by the fact that 
enhancements are typically larger than corrections, 
that enhancements are inherently more difficult to 
accommodate into an existing system, or both. 

As early as 1976, Belady and Lehman [14] demon- 
strated the benefits of program evolution models for the 
purpose of understanding the decay of software under- 
going change. Figure 4 summarizes how many modules 
and lines of source code have been added, changed, or 
deleted per change (Figure 2, questions 3 and 4). On 
average, three lines of code are added for every exist- 
ing line changed or deleted. Entire modules are rarely 
added and never deleted. In the SEL, maintainers do 
not significantly alter the system’s architecture to make 
changes. We hypothesize that the high number of lines 
added reflects the high proportion of enhancements, 
and that architectural stability reflects an “if it ain’t 
broke don’t fix it” attitude. Such an attitude could be 
explained by the general lack of understanding of over- 
all system architecture. The observed growth pattern 
also suggests that module functionality increases during 
maintenance, leading to a decrease in module cohesion. 
Decreased cohesion may not be a problem during the 
short lifespan of a satellite system, but may reduce 
the reuse potential of modules in future developments. 

Our most striking observation about SEL mainte- 
nance is the extent to which the maintenance processes 
vary across similar projects (Figure 2, question 5). 
Some of the variability reflects the size and composition 
of the maintenance teams (2-10 programmers). One 
particular area where the processes differ appears to be 
in the approach to testing. The projects studied have 
not established well-defined criteria for when system or 
integration testing should be performed during mainte- 
nance. Such variability in the process reflects the rela- 
tively ad hoc nature of the maintenance environment as 
compared to the development environment. In fact. 
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Figure 3* Distributions of effort and number of changes by type. 
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Lines of Code Modules 

Figure 4. Lines of code and modules per change. 


studies such as this one aim at increasing the maturity 
of the maintenance process within this environment. By 
identifying which aspects of the process are most suc- 
cessful, a single consistent process will be identified. 

4.2 Maintenance vs. Development 

Applying experience from past development studies to 
maintenance requires an understanding of the similari- 
ties and differences between maintenance and develop- 
ment. The goals of this part of the study were to 
compare changes made during development and main- 
tenance. types of changes, and change processes 
(Figure 5). These comparisons are possible because 
both development and maintenance data are available 
for the three systems studied. 

Throughout development and maintenance, the effort 
spent on each change is recorded. Effort is classified 
as easy when it takes less than an hour to complete 
a change, medium when it takes between an hour and a 


GOAL 4: Compare changes made during development and 
maintenance. 

QUESTION 6 

How does the effort per change compare? 

GOAL 5: Compare the types of changes made to products 
at both phases. 

QUESTION 7 

Are the faults found during maintenance different than 
those found during development? 

QUESTION 8 

How do the distributions of errors by class 
compare? 

GOAL 6: Compare change processes at both phases. 
QUESTION 9 

How does the distribution of effort by activity type 
compare? 

Figure 5. Measurement goals for understanding the similari- 
ties and differences between development and maintenance. 


day, and hard otherwise. A distinction is made between 
the effort to isolate a change (understand the request 
and locate the affected modules) and the effort to com- 
plete the change (design, code, test). Figure 6 shows 
that changes performed during maintenance generally 
require more effort than those performed during devel- 
opment (Figure 5, question 6). We consider two 
hypotheses that might account for this pattern: changes 
requested during maintenance are inherently harder 
than those requested during development; and it is 
more difficult to perform the same change during main- 
tenance than it would be during development. While we 
cannot determine whether particular modules are easy 
or difficult to change during maintenance based on our 
data, we are able to examine both hypotheses further at 
the level of the individual change. 

Regarding the first hypothesis, we find no obvious 
difference between the effort distribution patterns for all 
changes (Figure 6) and corrections only (Figure 7). We 
conclude that the increased effort is not primarily due to 
differences in the distributions of types of changes 
requested. 

Regarding the second hypothesis, various character- 
istic differences between development and maintenance 
are commonly thought to explain why the same change 
might be more difficult to perform during maintenance. 
These include product factors (such as increased com- 
plexity and missing or out-of-date documentation), pro- 
cess factors (such as schedule constraints, methods, and 
tools), and people factors (such as a lack of familiarity 
with the software). In the SEL, we cannot attribute the 
maintenance difficulties to product factors because there 
is already a sharp increase in change effort during 
acceptance test, but little change in the products. 
Instead, we suspect some combination of process and 
people factors. Although we are unaware of any sig- 
nificant methodological differences between the 
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Easy Madftjm Hard 

Effort to Complete Changes 


Figure 6. Effort to isolate and complete changes: mainte- 
nance vs. development. Easy, {1 hour; medium, )1 hour and 
(1 day; hard, )1 day. 

way a change is implemented during development 
or maintenance, development has a much higher 
rate of change activity: these systems average 
over 1,000 changes during testing. Although the high 
number of changes may increase certain costs (e.g., 
configuration control), it may actually reduce others 
(e.g., testing is not repeated once for every change). 
Maintained are not only generally unfamiliar with the 
systems they maintain, but the volume of maintenance 
may be insufficient to develop such familiarity. We 
expected the unfamiliarity with the maintained systems 
to have a more dramatic impact on the isolation activity 
(which might require an understanding of the entire 
system) than the completion activity (which typically 
requires only an understanding of individual modules). 
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Easy Medium Hard 

Effort to Complete Faults 


Figure 7. Effort to isolate and complete faults: maintenance 
vs. development. Easy, (1 hour; medium, >1 hour and (1 day; 
hard, )1 day. 

Instead, we discovered a proportional increase in both 
isolation and completion efforts (Figure 6). This may 
be explained by the fact that SEL maintained are 
experts in the application domain, not software devel- 
opment; therefore, they may be expected to readily 
understand the change specifications, but not the code. 

Both during development and maintenance a signifi- 
cant fraction of the changes are corrections (Figure 3). 
Figure 8 shows that the types of faults corrected during 
development and maintenance are similarly distributed 
(Figure 5, question 7). During maintenance, more cor- 
rections are related to incorrect initialization (21 vs. 
17%) and logic (25 vs. 19%), but fewer are related to 
incorrect interface (19 vs. 22%), data (26 vs. 28%), 
and computation (9 vs. 14%) as compared to develop- 
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Initialization Logic interface Data Computational 

Figure 8. Number of faults per class: maintenance vs. development. 


ment (see [20] for definition of classification scheme). 
Some of the differences seem to be related to the 
organizational structure of the environment. Mainte- 
nance is performed by people more familiar with the 
application domain and less familiar with the solution 
domain. The opposite is true for the developers. In this 
environment many application-specific parameters are 
rejected in the software as initialization parameters. As 
such, they require a clear understanding of the applica- 
tion, and faults are more easily found by maintainers. 
The opposite is true for typical solution faults such as 
interface and computational faults. 


Figure 9 shows that the distributions of errors differ 
significantly between maintenance and development 
(Figure 5, question 8). During maintenance, many 
more faults are attributed to inappropriate requirements 
or specifications (26 vs. 3%). and a few more are 
attributed to inappropriate design (11 vs. 8%): fewer 
are attributed to inappropriate implementation (55 vs. 
79%) or previous changes (2 vs. 10%). In attempting tc 
explain these differences, the following hypotheses have 
been formulated. Few faults are attributed to previous 
changes during maintenance because maintainers are 
unaware of changes made during development and the 
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total number of changes during this phase is low. The 
high proportion of faults attributed to the requirements 
or specifications reflects the nature of the testing: appli- 
cations experts are now using the systems to prepare 
for the missions, whereas during development most 
testing is performed by the developers themselves. 

During development and maintenance, effort data 
is collected according to the following process model: 
isolation (understanding a requested change and 
identifying the affected modules), design (proposing 
a change), implementation (implement the proposed 
change), unit and system test (testing the changed mod- 
ules and system), and acceptance test (testing a set of 
related changes). The development data include all 
effort; it is not limited to changes. 

Figure 10 shows that during maintenance, more effort 
is spent on design activities, about the same amount 
of effort is spent on implementation activities, and less 
effort is spent on testing activities (Figure 5, question 
9). The increase in design effort may be explained by a 
lack of familiarity with the system structure, resulting 
in increased effort to isolate changes. The decrease in 
testing effort may be explained by different testing 
procedures. During maintenance, integration testing is 
almost absent because the system structure doesn’t 
change much, and acceptance testing is performed for 
groups of changes together. 

How do these results compare with similar findings 
published in the literature? While comparing baseline 
data across environments is difficult, some patterns are 
evident. The increased cost of maintenance changes and 
corrections has been noted previously by many authors 
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[22, 27]. This lends support to the claim that faults 
introduced during design but discovered during mainte- 
nance may cost significantly more than if discovered 
and corrected earlier in the life cycle [27] . As has been 
noted in other environments [28], we find that mainte- 
nance changes in the SEL require more “up-stream” 
(i.e., design) than “down-stream” (i.e., testing) effort). 

4.3 Development for Maintenance 

As a final result of the maintenance measurement pro- 
gram, the SEL has enhanced its understanding of the 
impact of development decisions on maintenance 
(Figure 11). This increased understanding is illustrated 
by our initial findings concerning the complexity of 
delivered products and the quality of their documenta- 
tion. The qualitative results of this section are based 
primarily on subjective data from exploratory inter- 
views. Nevertheless, they are essential during the early 
phases of a measurement program for guiding future 
improvement cycles. 

Our initial inquiries have revealed complexity prob- 
lems related to intermodule structure and the encoding 
global information (Figure 11, question 10). Main- 
tained reported major problems related to the fact 
that global information was encoded redundantly. For 
example, constants were encoded in multiple 
FORTRAN common blocks. Software modification fre- 
quently resulted in inconsistent representations of global 
information. 

Two recurrent documentation problems have been 
identified (Figure 11, question 11). These concern the 
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GOAL 7: Characterize the impact of the delivered product 
on maintenance. 

QUESTION 10 

What structural product characteristics have positive/ 
negative effects on maintenance? 

QUESTION 11 

What product documentation standards have positive/ 
negative effects on maintenance? 

Figure 11. Measurement goal for understanding the effects 
of development on maintenance. 


use of program design language (PDL) and debug 
statements. PDL descriptions of each module are 
included in the source code as a header. Most maintain- 
ed regard PDL as redundant. Furthermore, the deliv- 
ered PDL is usually outdated. In the SEL environment, 
developers are required to keep their design PDL as 
part of the software module. Unfortunately, this PDL is 
frequently obsolete by the time the module reaches the 
maintenance phase; thus, it is useless to the maintain- 
ed. Also, the majority of people maintaining the soft- 
ware suggested that this practice be stopped entirely, 
since the same level of abstraction is provided to them 
in the code structure and comments. 

Many maintained suggested that the debug interface 
of the code be improved. Because attitude ground 
support software is highly computational, an exten- 
sive debug interface is provided with each system. The 
problem with the current debug interface is that fre- 
quently it assumes intimate familiarity with the code in 
that the output was of the form (variable) = (value). 
Maintained suggested that future debug interfaces 
provide a more descriptive explanation of the output 
printed. 

As we learn more about the problems maintained 
have with the software delivered from development and 
identify solutions to these problems, the guidelines and 
standards for development [7-9] will be modified to 
reflect these recommendations. 

5. SUMMARY AND CONCLUSIONS 

In this section, we summarize the benefits of the main- 
tenance measurement study for the SEL, outline future 
maintenance measurement directions within SEL, and 
package some of the general lessons learned about 
establishing measurement programs for use in other 
maintenance environments. 

5.1 SEL Maintenance Study Benefits 

The most immediate benefit of this program has been 
an enhanced undedtanding of the SEL maintenance 


environment. The quantitative baselines presented in 
the preceding section resulted in a better understanding 
of maintenance requests, maintained products, and 
maintenance processes. They enabled us to identify 
weaknesses in the SEL maintenance environment. 

The comparison between changes performed during 
development and maintenance has helped us understand 
where we may benefit from existing development base- 
lines. For example, whereas the distributions of faults 
corrected during development and maintenance are sim- 
ilar, effort distributions are not. This suggests that 
reuse of lessons learned from development is more 
justified when they pertain to faults than when they 
pertain to effort. 

Baselines may also be used to compare the effects of 
new development technologies on maintenance. For 
example, both cleanroom and an Ada/object-onented 
design approach have been applied on recent develop- 
ment projects with the expectation that 41 more reliable*' 
systems will result. We are now in a position to vali- 
date these expectations by comparing the effects of the 
new approaches to traditionally run projects. 

In the long term, development and maintenance are 
expected to improve as a result of our increased under- 
standing. At this point, recommendations for improve- 
ment are based predominantly on qualitative feedback 
from maintainers (rather than quantitative measurement 
baselines). Most of these suggestions have to do with 
the separation of the development aiui analysis organi- 
zations (Figure 1) and the absence of standard mainte- 
nance processes. The separation of development and 
maintenance means that a maintainer is entirely depen- 
dent on the code and documentation acquired at the 
time of delivery [29]. Consequently, inadequacies in 
the code or documentation are much more of an obsta- 
cle to maintenance than in an organization where main- 
tenance and development are more closely related. 
Each maintenance change is performed by one indi- 
vidual without much guidance regarding the main- 
tenance process itself. The ad hoc nature of the 
maintenance processes makes it hard to measure, com- 
pare measurements, and make recommendations. We 
expect our measurement program to contribute to the 
standardization of maintenance processes over time. 

Overall, the SEL maintenance measurement program 
is perceived as successful and beneficial to this particu- 
lar environment. The lessons learned from our study 
have resulted in changes and additions to the SEL 
standards and policies for software development [8]. 
Because numerous new projects are always under 
development in the SEL, we will be able to examine 
whether the revised standards have a measurable impact 
on the quality of the development product. 
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5.2 Future Maintenance Research 

As we continue to learn about the SEL maintenance 
environment, numerous future measurement directions 
become evident. Some directions reflect changes in 
the environment itself, others reflect changes in our 
understanding of the environment. We must continu- 
ally revise our goals, questions, metrics, and proce- 
dures to reflect the current priorities and understanding. 
Figure 12 contains an example set of revised questions 
for each of our seven maintenance goals to guide future 
maintenance studies. 

We must continue to revise our measurement pro- 
gram in response to previous misconceptions inherent 
in our initial qualitative models of maintenance process. 
For example, our current effort classification scheme 
does not explicitly recognize configuration management 
as a discrete activity. This effort is grouped together 
with nontechnical activities such as meetings and man- 
agement. In the future, we may want to update our data 
collection forms to include configuration management 
as a separate activity, since it seems to represent a 
significant portion of current maintenance effort. 


GOAL l : Characterize the changes performed during 
maintenance. 

QUESTION 1 

How many changes of each type are requested by 
different sources (e.g., analyst, operator)? 

GOAL 2: Characterize product evolution during 
maintenance. 

QUESTION 2 

How does coupling/cohesion change during 
maintenance? 

GOAL 3: Characterize the maintenance process stability. 
QUESTION 3 

Which process factors determined process stability 
(e.g., staffing level, familiarity with system)? 

GOAL 4: Compare changes made during development and 
maintenance. 

QUESTION 4 

What is the average change effort per module during 
each phase? 

GOAL 5: Compare changes made to products at both 
phases. 

QUESTION 5 

What are the distributions of requirements changes by 
type? 

GOAL 6: Compare development and maintenance processes. 
QUESTION 6 

What are the distributions of change effort by activity. 
GOAL 7: Characterize the impact of the delivered product 
on maintenance. 

QUESTION 7 

What product characteristics resulting from reuse have 
* positive/negative effects on maintenance? 

Figure 12. Revised measurement questions for future 
maintenance improvement cycles.' 
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When our empirical investigations identify important 
phenomena, we must refocus our measurement goals 
and questions in order to study the phenomena. For 
example, one hypothesized implication of the stable 
architecture of the maintained systems (very few 
modules are being added or deleted) is that module 
cohesion within these systems may be deteriorating. 
Such deterioration may lead to weaker and weaker 
system architecture, and ultimately lead to even more 
difficult maintenance. Such a hypothesis needs much 
closer investigation before it can be presented as a 
potential problem. 

When measurement does identify specific problems, 
the next step is to analyze the problems and attempt to 
identify viable solutions. For instance, we have quanti- 
fied the types and kinds of faults uncovered during 
maintenance. Next, we might begin to analyze their 
causes in development. Such analysis may lead us to 
mechanisms for preventing faults, or it may help us 
identify better ways of detecting them. 

Finally, the maintenance environment itself is contin- 
ually changing. Transitions to the use of Ada and 
Cleanroom development in the SEL will require peri- 
odic adjustments to our measurement procedures. Such 
changes are not unexpected; in fact, measurement by 
nature must continue to evolve as the environment 
evolves. 

5.3 Measurement Lessons Learned 

The extension of the SEL into maintenance not only 
enabled us to gain experience with maintenance 
but also with establishing a maintenance measurement 
program [25], 

Our first lesson is that there is a distinction, at least 
conceptually, between start-up and routine phases of 
measurement. During the start-up phase, there is con- 
siderable freedom to reevaluate measurement goals and 
redesign the metrics and procedures as our understand- 
ing of the local priorities and what is feasible grows. 
Once data collection forms have been designed and 
reflected in the data base and once people have been 
instructed in the procedures, it becomes expensive to 
introduce further changes. It is therefore critical that 
the start-up phase proceed cautiously. We suggest vali- 
dating all measurement procedures through pilot studies. 

Our second lesson concerns which questions are 
suitable for routine measurement. It may be tempting to 
use routine measurement as a mechanism for answering 
questions that could be resolved more efficiently by 
other means. For example, if the software design docu- 
mentation is never maintained, it would be wasteful to 
discover this via routine data collection. Routine mea- 
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surement is appropriate for monitoring large-scale and 
historical trends, but it is not needed to ascertain simple 
facts. Many of the questions we would like to pursue 
are risky, i.e., we cannot be sure that the resulting data 
will prove useful. 

Third, we have found the establishment of a 
measurement program in a new environment to be 
a time-consuming and sensitive task. Getting the pro- 
gram started requires building initial models of the 
maintenance organization, the maintained products, the 
maintenance processes, and the specific maintenance 
problems at hand. These models are used to design the 
measurement procedures, but must be validated during 
the start-up phase. Special care must also be taken to 
establish the creditability of measurement and win the 
cooperation needed to make the program a success. To 
collect valid data, the people providing most of the data 
need to be well motivated and instructed. Motivation 
requires addressing measurement goals of direct inter- 
est to the people providing cooperation and an opportu- 
nity for these people to review and comment on the 
resulting data and analyses. 

Our analysis results demonstrate the immediate 
returns possible from investment in a measurement 
program. A measurement program provides invaluable 
insight into the processes and products within the given 
environment. As long as measurement is performed 
within a context of well defined goals and questions, 
such a program can be a success for any software 
organization. 
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APPENDIX A: Data Collection Forms 


WEEKLY MAINTENANCE EFFORT FORM 

N a me: 

For Ubranan * Um Only 

Numbor: 

Protect: 

Oslo: 


Friday Data: 

Checked by: 

Section A - Total Hours Spent on Maintenance ooeiudM^^ionx 

•cthrittw for tho profsel «ra*»d*ng writing apoeffleatton modWcottono) 


— n 

Section B — Hours By Class of Malntanancs (Tool or hour* in sscaon a should seusi >ot»i hours in tscson a) 

Class 

Definition 

Hours 

Correction 

Hour* apant on all malntanancs associated with a syatam 
fallura. 


Enhancement 

Hours apant on all malntanancs associated with modifying the 
syatam due to a requirements changa. Includas adding, 
delating, or modifying syatam faatures as a result of a 
requirements change. 


Adaptation 

Hour* rpent on all maintenance associated with modifying a 
system to adapt to a changa in hardware, syatam software, or 
environmental characteristics. 


Other 

Othar hours spent on the project (related to malntanancs) not 
covered above. Includas management, meetings, ate. 


Section C — Hours By Maintenance Activity (Tox or hour* in socson c idoum «qo*i total hour* in ttrtirm a) 

Activity 

Activity Definitions 

Hours 

Isolation 

Hours spent understanding the fallura or request for 
enhancement or adaptation. 


Change 

Daaign 

Hours spent actually redesigning the system based on an 
understanding of the necessary changa. 


Implsmsntatton 

Hours spent changing the syatam to complete the necessary 
changa. This includas changing not only the coda, but the 
associated documentation. 


Unit Tast/ 
Syatam Tast 

Hours spent tasting the changed or added components. 
Includes hours spent tasting the integration of the components. 


Acceptance/ 
Benchmark Tast 

Hours spent acceptance tasting or benchmark tasting the 
modified system. 


Othar 

Othar hours spent on the project (related to maintenance) not 
covered above. Includes management, meetings, etc. 

y 

1 


MAY 1969 

Figure Al. Weekly Maintenance Report Forms. 
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MAINTENANCE CHANGE REPORT FORM 

Mama. OSMR Numbsr 

For Librarian » Um Onfy 

Nurnbar 

DM: 

Project: Date: 

Entarad by 

Chackadbv 

1 t l 


SECTION A: Change Request Information 
Functional Description of Change: — — 


What was the type of modification? 

Correction 

Enhancement 

Adaptation 


SECTION B: Change Implementation Information 

Components Changed/Added/Deleted: — 


What caused the change? 

Requirements/specifications 

Software design 

Code 

Previous change 

Other 


1 hr to 1 day to 1 week to 
< ihr 1 day 1 week 1 month > i month 


Estimate effort spent isolating/determining the change: 
Estimate effort to design, implement, and test the change: 


Chock all changed objects: 

Requirements/Specifications Document 

Design Document 

Code 

System Description 

User's Guide 

Other 


If code changed, characterize the change (cherk most 
sppiicsbls) 

Initialization 

Logic/control structure 

(s.g., changed flow of control) 

Interface (internal) 

(modula to module communication) 

Interface (external) 

(module to external communication) 

Data (value or structure) 

(s.g., variable or value changed) 

Computational 

(e.g., change of math expression) 

Other (none of the above apply) 


Estimate the number of lines of code (including comments): 


added changed deleted 


Enter the number of components: 

added changed deleted 

Entar the number of the added components that are n>w reused WRIT 

modifications 


MAY 1989 

Figure A2. Maintenance Change Report Form. 
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